Abstract

The synthesis of a library of tetrahydro-β-carboline-containing compounds
in milligram quantities is described. Among the unique heterocyclic
frameworks are twelve tetrahydroindolizinoindoles, six
tetrahydrocyclobutanindoloquinolizinones and three
tetrahydrocyclopentenoneindolizinoindolones. These compounds were selected
from a virtual combinatorial library of 11,478 compounds. Physicochemical
properties were calculated, and most of the compounds are in accordance
with Lipinski's rules. Virtual docking and ligand-based target evaluations
were performed for the β-carboline library compounds and selected synthetic
intermediates to assess the therapeutic potential of these small organic
molecules. These compounds have been deposited into the NIH Molecular
Repository (MLSMR) and may target proteins such as histone deacetylase 4,
endothelial nitric oxide synthase, 5-hydroxytryptamine receptor 6 and
mitogen-activated protein kinase 1. These in silico screening results aim
to add value to the β-carboline library of compounds for those interested
in probes of these targets.



Introduction 

Identification of a comprehensive set of small organic molecules capable of
selectively modifying the function of biological targets tremendously
impacts modern medical research and drug discovery efforts [1]. Currently,
this set of small molecules is largely occupied by in-house libraries and
commercially available compounds. The NIH Roadmap initiative was
established to address a recognized limitation of current compound
diversity, resulting in the Molecular Libraries Probe Centers Network
(MLPCN), which has, since its inception, garnered a library of over 370,000
chemically diverse small molecules in a central molecule repository [2].
This supply of compounds has been made possible by researchers across the
disciplines, but largely by synthetic chemists who are preparing compounds
with an eye towards biologically relevant targets. Another goal of the NIH
Roadmap is the development of enabling methods for the synthesis of these
structurally diverse compound libraries; amongst these methods, skeletal
diversification strategies have emerged as particularly efficient for
maximizing structural diversity [3].

Previous work in the Brummond laboratory has demonstrated that an
allene-containing β-carboline provided a good starting point for
synthesizing six novel types of hetero-frameworks, all skeletally unique
[4]. Moreover, scope and limitation studies contributed to an understanding
of chemistries that would possess the robustness necessary for library
preparation. Information gained from these experiments was then utilized in
the construction of a virtual library of 11,748 compounds. A diversity
analysis was performed using BCUT metrics (B: Burden; C: CAS; UT: Pearlman
at the University of Texas) and Tanimoto coefficients (Tc), and this
virtual compound library was mapped onto the existing chemical space of the
NIH Molecular Libraries Small Molecule Repository (MLSMR) [4]. When
considering the physical properties most important to bimolecular
interactions (atomic Gasteiger-Hückel charges, polarizabilities, and
hydrogen-bond acceptors), these virtual compounds were found to occupy new
chemical space when compared to the 327,000 compounds in the MLSMR. A small
subset of these compounds was subsequently identified as representing a
maximally diverse chemical space. The synthesis of a modified subset of
this virtual compound library is described herein, where modifications were
mainly driven by studies of compound stability. Furthermore, a
high-throughput, in silico screening analysis of this library identified a
number of potential biological targets for the compounds.

Results and Discussion 

Scaffolds 1, 2 and 3 (Figure 1) were chosen for library preparation based
upon favorable Tanimoto coefficient (Tc) scores when compared to the MLSMR,
conformational constraints imposed by the β-carboline moiety, and the
number of building blocks available for the diversifying elements R¹ and
R².

The syntheses of tetrahydro-β-carbolines 6{1-16} were accomplished in a
manner entirely analogous to that reported previously (Table 1, entries
1-16) [4]. For example, the allenic methyl ester of tryptophan 4 was
reacted with a number of aldehydes 5{1-15} under acidic conditions to
produce the corresponding products in yields ranging from 54-89%. A range
of aldehydes was accommodated in the Pictet-Spengler reaction, including
formaldehyde (Table 1, entry 1), alkyl aldehydes (Table 1, entries 2 and
3), aryl aldehydes with electron-withdrawing and electron-donating groups
(Table 1, entries 4-7, 14 and 15), heteroaromatic aldehydes (Table 1,
entries 8-13) and glyoxalates (Table 1, entry 16). Moreover, useful
quantities of β-carboline-containing products were obtained (43-100 mg).
For entries 2-15, mixtures of two diastereomers were obtained. Since the
mixtures could not be readily separated by column chromatography,
diastereomeric ratios were determined by NMR and the mixtures were advanced
without further purification. Reaction of allene 6 under the
silver-nitrate-mediated cyclization conditions afforded the desired fused
pyrrolines 1. However, in the initial phases of this work, a color change
was noted during the purification process. Indeed, when NMR stability
studies were performed on the syn- and anti-pyrrolines 1{5}, decomposition
of both diastereomers was evident. Although it was generally difficult to
isolate the individual diastereomers, they could be separated by column
chromatography. It was found that anti-1{5} decomposed more rapidly than
syn-1{5} during the ¹H NMR stability studies, when compared to an internal
standard. These results, combined with previously reported skeletal
reorganization processes of functionalized β-carbolines, led to concerns
about the long-term storage of these compounds and their inclusion in the
MLSMR [5].

To increase the stability of this class of compounds, a toluenesulfonyl
group was added to the indole nitrogen of 1{1-7} to give
N-tosyl-tetrahydro-β-carbolinepyrroline derivatives 7{1-7}. These tosylated
derivatives exhibited improved stability as evidenced by ¹H NMR (Table 1,
entries 1-7, and Supporting Information File 1, S76-S81). Incorporation of
the tosyl group also eased the chromatographic separation of the syn- and
anti-isomers for entries 2-7; thus compounds 7{2-7} were obtained as single
diastereomers. Low to moderate yields for this two-step reaction sequence
were attributed to a problematic tosylation due to the sterically hindered
nature of the indole nitrogen atom. Moreover, unforeseen limitations were
encountered for the heteroaromatic and naphthyl-containing β-carboline
intermediates (Table 1, entries 8-15). While in some cases the intermediate
pyrrolines 1 were observed (Table 1, entries 8, 10, 14, 15), the
corresponding tosylated products were not obtained. The heteroaromatic
examples (Table 1, entries 9, 11-13) did not undergo cyclization upon
treatment with silver nitrate. For these cases, it was assumed that
competing coordination of the heteroatom to the silver ion was an issue;
however, attempts were not made to alter the reaction conditions for these
substrates. Furthermore, conversion of the naphthalene-containing analogues
1{14} and 1{15} to their corresponding tosylates was not successful.

Table 1. Conditions for 6 → 1 (R = H) → 7 (R = Ts): 1) AgNO3, acetone, rt,
16 h, dark, degassed; 2) TsCl, NaOH, TEBA, CH2Cl2, ultrasound, rt, 1 h.

entry  aldehyde  yield of 6ᵃ (%)  anti:synᵇ  yield of 7ᵃ (%)              purityᶜ
1      5{1}      68  6{1}         NA         28 7{1}                      98%
2      5{2}      74  6{2}         2.5:1      17 syn-7{2}                  98%
3      5{3}      81  6{3}         2.1:1      23 syn-7{3}                  98%
4      5{4}      73  6{4}         2.4:1      11 anti-7{4}; 28 syn-7{4}    98%; 98%
5      5{5}      75  6{5}         2.4:1      34 anti-7{5}; 17 syn-7{5}    98%; 98%
6      5{6}      74  6{6}         1.3:1      35 anti-7{6}; 10 syn-7{6}    98%; 98%
7      5{7}      54  6{7}         2.7:1      12 anti-7{7}; 17 syn-7{7}    98%; 98%
8      5{8}      89  6{8}         2.3:1      NDᵈ
9      5{9}      78  6{9}         1.2:1      NDᵉ
10     5{10}     80  6{10}        1:1.1      NDᵉ
11     5{11}     70  6{11}        1.3:1      NDᵉ
12     5{12}     80  6{12}        3.5:1      NDᵉ
13     5{13}     63  6{13}        1:1.4      NDᵉ
14     5{14}     58  6{14}        2.3:1      ND
15     5{15}     80  6{15}        2.3:1      ND
16     5{16}     88  6{16}        1.4:1      8 syn-1{16}                  99%

ᵃ Isolated yields; ᵇ dr determined by NMR; ᶜ purity established by
LCMS/ELS; ᵈ compound 6 detected; ᵉ compound 6 not detected.

Next, compounds possessing the cyclobutene-fused β-carboline skeleton were
assembled from the versatile allenyl intermediate 6. For this subset of
compounds, acylation of the sterically hindered amine of 6{1} with the
ynoic acids 8{1-8}, by using bromo-tris-pyrrolidino-phosphonium
hexafluorophosphate (PyBroP), provided the requisite allene-yne substrates
9{1,2}-9{1,8}. For the coupling reaction of the ynoic acids bearing an aryl
group on the terminus of the alkyne, 8{5-8}, the desired allene-ynes
9{1,5}-9{1,8} were afforded along with the [2 + 2] cycloadducts
2{1,5}-2{1,8} (Table 2, entries 5-8). Previous studies with related
tryptophan-substituted allene-ynes required much higher temperatures
(225 °C versus rt) to give the [2 + 2] cycloadducts, albeit none of these
examples possessed two radical-stabilizing groups on the alkyne. Because
the calculations performed by Tantillo [6] regarding the thermal [2 + 2]
cycloaddition reaction suggest that the energy barrier for this reaction
should not be affected by the presence of two radical-stabilizing groups
over one, it was postulated that this cycloaddition process was facilitated
by exposure to incident light [7]. Thus, the mixtures of compounds 2 and 9
were reconstituted in CH2Cl2 and placed in front of two 6 W UV lamps for
16 h at rt to afford the desired cyclobutenes (Method B, Table 2, entries
5-8). During optimization of the conditions for the [2 + 2] cycloaddition
reaction (Method A), it was found that reducing the reaction temperature
from 225 °C to 160 °C afforded cyclobutene 2{1,3} in 73% yield (Table 2,
entry 3). Similarly, allene-yne 9{1,5} was subjected to the lower reaction
temperature (Method A) to produce cyclobutene 2{1,5} in 57% yield (Table 2,
entry 5).
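The combined 57% yield quoted for cyclobutene 2{1,5} can be checked with a short calculation. The sketch below uses the three figures given in the Table 2 footnote (18% of 2{1,5} isolated directly, 57% of 9{1,5} isolated, and 68% conversion of 9{1,5} under Method A). It assumes that the two isolated yields refer to the same starting material, which the text implies but does not state; the function name is illustrative only.

```python
def combined_yield(direct_pct, intermediate_pct, conversion_pct):
    """Overall yield (%) of a product that is isolated directly and is also
    formed from a separately isolated intermediate.

    Assumption: direct_pct and intermediate_pct are both based on the same
    starting material, so the two contributions can simply be added.
    """
    return direct_pct + intermediate_pct * conversion_pct / 100.0


# Figures from the Table 2 footnote for entry 5.
overall = combined_yield(direct_pct=18, intermediate_pct=57, conversion_pct=68)
print(f"combined yield of 2{{1,5}}: {overall:.2f}% (rounded: {round(overall)}%)")
```

Under that assumption the arithmetic reproduces the reported value: 18 + 0.57 × 68 = 56.76, which rounds to 57%.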

For the final library scaffold, a small subset of
α-methylenecyclopentenone-containing tetrahydro-β-carbolines was
synthesized. These compounds contain a general substructure that has
recently been shown to inhibit DNA damage checkpoints [8]. Allene-ynes
9{1,3}-9{1,4} undergo Pauson-Khand cyclocarbonylation reactions when
treated with molybdenum hexacarbonyl in DMSO/toluene solutions (Table 3).
Allene-yne 9{1,2} afforded a mixture of four compounds comprising two
diastereomers of 3{1,2}, the 4-alkylidene cyclopentenone resulting from the
cyclocarbonylation reaction with the distal double bond of the allene, and
a fourth compound, which could not be identified (see spectral data in
Supporting Information File 1). Aryl-substituted alkynones 9{1,5}-9{1,8}
were not available for the molybdenum-mediated cyclocarbonylation process
due to competing [2 + 2] cycloaddition reactions (Table 2).

The majority of these β-carboline-containing products exhibit acceptable
calculated physicochemical properties in accordance with Lipinski's rule of
five (Figure 2) [9,10]. These favorable properties and structural novelty
make these compounds valuable candidates for deposition in the MLSCN for
biological activity evaluation.
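As an illustration of how a rule-of-five filter works, the following minimal sketch counts violations of the four standard Lipinski criteria (molecular weight at most 500, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors, log P at most 5). It is not the software used in this work, and the two example property sets are hypothetical values chosen for illustration, not data from Figure 2.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    mol_weight: float        # g/mol
    h_bond_donors: int
    h_bond_acceptors: int
    log_p: float             # calculated octanol/water partition coefficient


def lipinski_violations(p: Properties) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        p.mol_weight > 500,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
        p.log_p > 5,
    ]
    return sum(checks)


def passes_rule_of_five(p: Properties, max_violations: int = 1) -> bool:
    """Common convention: tolerate at most one violation."""
    return lipinski_violations(p) <= max_violations


# Hypothetical property sets, for illustration only.
compliant = Properties(mol_weight=450.0, h_bond_donors=2, h_bond_acceptors=6, log_p=4.2)
heavy_lipophilic = Properties(mol_weight=566.6, h_bond_donors=1, h_bond_acceptors=6, log_p=5.7)

for label, props in [("compliant", compliant), ("heavy_lipophilic", heavy_lipophilic)]:
    print(label, lipinski_violations(props), passes_rule_of_five(props))
```

Allowing one violation is a widely used convention; a stricter filter can be obtained by passing `max_violations=0`.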

Table 2.

entry  R²         ynoic acid  yield of 9ᵃ (%)   methodᵇ  yield of 2ᵃ (%)  purityᶜ
1      H          8{1}        0  9{1,1}
2      TMS        8{2}        64 9{1,2}         A        NDᵈ
3      Me         8{3}        75ᵉ 9{1,3}        A        73 2{1,3}        99%
4      [graphic]  8{4}        71 9{1,4}         A        57 2{1,4}        99%
5      aryl       8{5}        9{1,5} + 2{1,5}   A        57ᶠ 2{1,5}       99%
                                                B        40 2{1,5}
6      aryl, CF3  8{6}        9{1,6} + 2{1,6}   B        41 2{1,6}        99%
7      aryl, OMe  8{7}        9{1,7} + 2{1,7}   B        30 2{1,7}        99%
8      aryl, Me   8{8}        9{1,8} + 2{1,8}   B        33 2{1,8}        99%

ᵃ Isolated yield; ᵇ method A: μW, 160 °C, DMF, 10 min; method B: placed in
front of two 6 W UV lamps (245 nm), CH2Cl2, rt, 16 h, no stirring; ᶜ purity
established by LCMS/ELS; ᵈ ND = not detected; ᵉ μW, 225 °C, DMF, 7 min
(39%); ᶠ separated 9{1,5} (57% yield) from 2{1,5} (18% yield) after the
coupling reaction, then submitted 9{1,5} to method A to give 2{1,5} in 68%
yield. This compound was recombined with the previously isolated 2{1,5} to
afford a combined 57% yield.

Diversity-oriented synthesis (DOS) has been employed to generate thousands
of the organic compounds that have been deposited in the NIH molecular
repository for medicinal chemistry research. Deciphering the therapeutic
potential of this many compounds is a continuing challenge. By combining
chemogenomics databases, such as the Protein Data Bank (PDB) and ChEMBL, it
is possible to map new compounds into existing chemical space and to
predict protein targets for new compounds. Two complementary strategies can
be implemented. One is a structure-based docking strategy, in which a query
compound is fit into a series of protein binding pockets to identify
favorable compound-protein interactions. A second approach is a
ligand-based strategy, in which the structural similarities between a query
compound and a collection of bioactive compounds are identified. In the
present study, both of these strategies were used to predict potential
targets of the newly synthesized library of β-carboline-containing
compounds.



Table 3: Synthesis of α-methylenecyclopentenone library 3{1,3}-3{1,4}.

Conditions for 9 → 3: Mo(CO)6, PhMe/DMSO, 95 °C, 2 h.

entry  R²         allene-yne  yield of 3ᵃ (%)  purityᵇ
1      Me         9{1,3}      48 3{1,3}        99%
2      [graphic]  9{1,4}      46 3{1,4}        99%

ᵃ Isolated yields; ᵇ purity determined by LCMS/ELS.






Beilstein J. Org. Chem. 2012, 8, 1048-1058. 







compound    ᵃ       ᵇ   ᶜ      ᵈ
1{5}        378.8   4   5.10   48
7{1}        422.5   7   3.02   75
7{2}        462.6   7   3.97   64
7{3}        478.6   7   4.32   65
anti-7{4}   498.6   7   4.13   66
syn-7{4}    498.6   7   4.49   65
anti-7{5}   533.0   7   4.51   66
syn-7{5}    533.0   7   4.98   65
anti-7{6}   566.6   7   4.93   66
syn-7{6}    566.6   7   5.71   65
anti-7{7}   558.6   9   4.13   81
syn-7{7}    558.6   9   4.62   81
syn-1{16}   340.4   4   3.79   80
2{1,4}      360.4   5   4.06   72
2{1,5}      396.4   5   4.51   75
2{1,6}      464.4   5   5.45   75
2{1,7}      426.5   6   4.59   83
2{1,8}      416.5   5   5.02   72
3{1,3}      362.4   8   2.50   104
3{1,4}      388.4   8   3.24   99

Figure 2: Library of tetrahydro-β-carboline-containing compounds and
calculated properties (ᵃ molecular weight; ᵇ hydrogen-bond donor/acceptor …



High-throughput docking studies for protein-target prediction of newly synthesized compounds

Molecular docking studies were performed with the 34 newly synthesized
compounds, represented by scaffolds 1, 2, 3, and allenyl precursors 4,
6{1-16}, 9{1-4}, to identify potential protein targets [11]. Protein
structures were downloaded from the PDB [12], and the analysis was limited
to a selection of the 607 proteins defined as "druggable" targets, in order
to reduce computational time [13]. (The complete listing of these proteins
and their PDB IDs is provided in Supporting Information File 2.) The
Surflex-Dock module of the Sybyl software was employed for protein
preparation and docking of the β-carboline library [14,15]. Water molecules
and ligands were removed from the protein structures, and the active site
of each protein was defined by the corresponding residues around the
cocrystallized ligands. In-house algorithms were used to evaluate
ligand-docking efficiency, and docking scores were used to assess and rank
the protein targets.



A portion of the protein-scoring matrix is illustrated in Figure 3. Several
interesting results emerged from this in silico analysis: (1) twenty of the
new compounds have docking scores greater than 7.0, a number that can be
mapped to Kd values less than 100 nM, for several protein targets; (2) six
compounds, 7{7} and 2{1,4-1,8}, are predicted to be ligands for a single
protein, human HDAC4 (PDB ID: 2vqq); (3) compounds 2{1,5} and 2{1,7} are
predicted to be high-affinity ligands for 3-hydroxy-3-methylglutaryl-coenzyme
A reductase (PDB ID: 2q6b) and HDAC4, respectively, even though they are
structurally different from the corresponding cocrystallized ligands HR2
and TFG; and (4) compounds 2{1,6} and 6{10} are predicted to be ligands for
a total of 15 protein targets. This high number of potential protein
targets may be due to the electronegative trifluoromethyl group on 2{1,6}
and the effect it would have on the α,β-unsaturated amide, and to the
purported bioactivity of the pyrazole group of 6{10}.

PDB ID   A       4       7{1}    7{7}    2{1,3}  2{1,4}  2{1,5}  2{1,6}  2{1,7}  2{1,8}
1nuh     7.23    6.45    -4.46   -25.14  3.98    3.49    5.4     3.91    6.17    1.79
1i7b     3.07    7.32    1.74    0.9     4.6     5.28    4.24    3.49    3.84    2.66
2xfo     -0.07   7.29    -4.96   -12.59  -4.95   2.88    1.81    -0.33   0.45    0.99
1d4l     5.78    7.17    3.06    3.08    3.68    3.55    3.31    4.98    3.19    2.29
2vgb     3.13    7.14    -7.1    -17.8   2.7     1.92    0.96    -0.13   -2.08   0.68
2qro     4.7     7.06    -21.42  -51.03  -3.69   -2.23   -2.36   -9.21   -13.18  -7.52
3gqy     4.98    4.96    7.75    0.67    3.66    4.38    3.3     0.84    -1.85   1.54
2vqq     5.24    5.07    1.31    7.14    6.74    7.59    7.69    7.85    7.99    7.03
1s9i     3.99    5.26    3.84    0.35    7.44    5.09    3.03    4.32    3.52    2.15
2q6b     4.76    5.49    5.15    -5.63   7       6.03    7.92    4.47    3.92    2.84
2v3e     4.93    4       2.54    -4.43   5.43    5.3     7.23    7.54    7.46    5.96
2vgq     5.04    6.27    3.42    1.9     3.76    3.87    4.02    7.95    4.35    3.04
3hhm     4.5     4.01    2.4     1.95    3.97    5.42    6.65    7.63    2.71    3.73
3hdn     4.39    4.22    3.63    1.21    2.68    2.86    2.29    7.56    6.14    1.92
3g7w     3.89    5.77    3.31    1.84    4.92    3.97    5.2     7.27    4.8     4.43
3eqc     6.69    5.33    1.59    -0.93   4.39    6.36    5.13    7.08    7.15    6.95
3fwl     5.9     4.39    1.99    1.97    4.03    4.07    6.34    5.18    7.04    3.96
1so2     5.05    6.12    2.3     1.88    6.34    4.21    5       6.2     5.29    7.22

PDB ID   2{1,8}  3{1,4}  6{3}    6{4}    6{6}    6{7}    6{9}    6{10}   6{12}   6{13}   6{15}
1d4l     2.29    3.02    3.26    3.55    4.55    3.51    5.15    4.83    7.3     5.22    4.1
1so2     7.22    3.84    3.84    6.32    7.25    6.08    8.17    5.78    6.15    5.77    6.43
3ljr     5.45    7.11    3.35    4.66    0.33    3.81    3.33    1.63    4.32    6.29    4.81
2zv2     2.53    3.45    7.37    3.36    2.02    2.83    1.37    4.2     2.05    4.38    1.75
2on3     2.58    2.8     2.13    7       3.02    6.2     4.17    -1.13   3.28    3.84    1.65
3fxw     2.59    3.47    3.09    5.02    5.39    7.65    5.13    4.53    5.69    5.94    3.08
3mhj     -5.26   3.16    3.5     2.03    6.9     7.17    4.23    3.44    6.6     5.53    6.17
1pq6     -1.13   -1.86   6.35    5.8     4.46    6.91    7.57    8.28    6.32    7.02    6.15
1t40     2.37    2.74    2.84    3.52    4.09    5.66    4.64    8.75    6.17    1.69    5.43
3ipq     -9.79   -3.95   5.21    5.47    4.69    6.55    6.9     8.65    5.44    4.37    6
2gvj     4.43    3.1     3.32    4.13    4.67    3.63    3.15    8.47    -7.88   4.49    4.3
1dgh     -1.94   4.06    5.08    4.02    4.75    4.15    6.06    7.74    3       7.17    3.27
3nos     4.35    2.27    3.31    3.84    6.79    6.67    3.47    7.35    7.27    5.18    7.6
1ogs     5.8     4.21    3.96    5.08    5.07    5.91    5.92    7.24    5.31    4.78    5.7
1nf7     2.59    3.62    4.63    4.75    -1.19   4.97    0.46    7.21    -5.03   4.8     -1.1
1lpg     3.07    2.38    2       4.86    4.09    2.75    4.45    4.57    5.52    7.01    2.96

[Structures: HR2 from PDB:2q6b; 2{1,5}; TFG from PDB:2vqq; 2{1,7}]

Figure 3: Results of high-throughput docking analysis. Top: a docking-score
matrix arranged by compound IDs and PDB IDs; bottom: structures of known
ligands HR2 and TFG and the newly synthesized compounds 2{1,5} and 2{1,7}.
Docking scores larger than 7.0 are colored red and can be mapped to Kd
values less than 100 nM. The corresponding protein names of PDB IDs and the
full docking-score matrix are listed in Supporting Information File 1.
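The score threshold used in this analysis can be sketched in a few lines. The example below assumes that the docking score is expressed in -log10(Kd) units with Kd in mol/L, which is consistent with the stated correspondence between a score of 7.0 and 100 nM; the function names are illustrative and are not part of the in-house algorithms mentioned above. The scores are the Figure 3 entries for HDAC4 (PDB ID 2vqq).

```python
def score_to_kd_nM(score: float) -> float:
    """Convert a docking score, assumed to be -log10(Kd / M), to Kd in nM."""
    return 10 ** (9 - score)


def predicted_ligands(scores: dict, threshold: float = 7.0) -> list:
    """Return compound IDs scoring above the threshold, best score first."""
    hits = [cid for cid, s in scores.items() if s > threshold]
    return sorted(hits, key=lambda cid: scores[cid], reverse=True)


# Docking scores against HDAC4 (PDB ID 2vqq), taken from Figure 3.
scores_2vqq = {
    "7{7}": 7.14,
    "2{1,3}": 6.74,
    "2{1,4}": 7.59,
    "2{1,5}": 7.69,
    "2{1,6}": 7.85,
    "2{1,7}": 7.99,
    "2{1,8}": 7.03,
}

hits = predicted_ligands(scores_2vqq)
for cid in hits:
    print(f"{cid}: score {scores_2vqq[cid]:.2f}, Kd about {score_to_kd_nM(scores_2vqq[cid]):.0f} nM")
```

With these inputs the filter returns six compounds, 7{7} and 2{1,4} through 2{1,8}, in agreement with result (2) above, while 2{1,3} (score 6.74) falls just below the cut-off.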

Ligand-based strategy for target prediction 

Ligand-based target prediction algorithms have been developed based upon an
established medicinal chemistry principle that structurally similar
compounds, with comparable physical properties, should convey related
biological properties [16,17]. In this study, structural similarities were
calculated between the compounds of the β-carboline library and the
bioactive compounds in the well-annotated database ChEMBL version 13, the
largest publicly available compound-target database, containing 1,143,682
distinct compounds, 8,845 targets and 6,933,068 bioactivity entries from
44,682 publications and PubChem bioassays [18,19]. The Open Babel FP2
fingerprint was used as a descriptor to assess similarities between
molecules [20]. Tanimoto coefficients were calculated between the compounds
of the β-carboline library and the ChEMBL database, and only β-carboline
compounds with a Tc greater than 0.60 were considered for bioactivity
analysis. A lower Tc threshold was used to identify a larger number of
bioactivity targets. Table 4 lists the most promising bioactive targets for
the newly synthesized β-carbolines together with the structurally similar
lead compounds in ChEMBL, along with their reported potency and literature
citations. Several interesting results emerge from the comparison study,
including a number of targets that the compounds should be screened
against, such as C-C chemokine receptor type 3, gamma-aminobutyric acid
receptor subunit gamma-2, breast adenocarcinoma cells, 5-hydroxytryptamine
receptor 6, angiotensin-converting enzyme, and DNA polymerase iota.
Moreover, nine of the twelve compounds are represented by allene
precursors, ones that were not originally considered in the diversity
analysis.

[Table 4, surviving entries. Tc scores: 0.79, 0.80, 0.86, 0.82, 0.79, 0.85.
Targets: mitogen-activated protein kinase 1; adenocarcinoma cells; DNA
polymerase iota. ChEMBL compounds: CHEMBL44295; CHEMBL1360719. Reported
activities: potency = 0.794 µM (PubChem AID:995); inhibition = 78% [29];
potency = 1.78 µM (PubChem AID:588590).]
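The similarity filter described in this section can be illustrated with a minimal sketch. It represents each fingerprint as a set of "on" bit positions and applies the 0.60 Tanimoto cut-off from the text. The fingerprints and compound names below are hypothetical toy values; in the study the fingerprints were Open Babel FP2 fingerprints, which this example does not compute.

```python
def tanimoto(fp_a, fp_b) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bits:
    Tc = |A & B| / |A | B|."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention for two empty fingerprints
    return len(a & b) / len(union)


def similar_references(query_fp, reference_fps, threshold=0.60):
    """Return (name, Tc) pairs for references with Tc above the threshold,
    most similar first."""
    hits = []
    for name, fp in reference_fps.items():
        tc = tanimoto(query_fp, fp)
        if tc > threshold:
            hits.append((name, tc))
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical toy fingerprints (sets of on-bit indices), for illustration only.
query = {1, 2, 3, 4, 5}
references = {
    "ref_A": {1, 2, 3, 4, 6},   # shares 4 of 6 distinct bits with the query
    "ref_B": {1, 2, 7, 8},      # shares 2 of 7 distinct bits with the query
}

for name, tc in similar_references(query, references):
    print(f"{name}: Tc = {tc:.2f}")
```

Only `ref_A` (Tc = 4/6, about 0.67) passes the cut-off here; `ref_B` (Tc = 2/7, about 0.29) is discarded. In a real workflow, the bioactivity annotations of the retained reference compounds would then be transferred to the query as candidate targets.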

Conclusion 

A library of 34 β-carboline-containing compounds was synthesized utilizing
a skeletal diversification strategy. High-throughput docking and
ligand-based protocols were implemented to predict potential biological
targets of the newly synthesized β-carbolines. The docking approach uses a
structure-based technology to predict preferred interactions between
compounds and protein targets, whereas the ligand-based method uses ligand
similarity coefficients to identify potential biological targets. The
complementary nature of these two protocols is evidenced by the fact that
there was no overlap in the predicted biological targets. Furthermore, the
in silico screening of these compounds is intended to add value to the
library by directing them to appropriate biological assays. Such strategies
can also be used to explore the mechanisms of a biologically active
compound in bioassays whose molecular target is as yet unidentified.



Supporting Information 

Supporting Information File 1 

Experimental procedures and spectral data for compounds 
1{5}, 1{16}, 2{1,3-8}, 3{1,2-4}, 4, 6{1-10}, 6{12-13},
6{15}, 7{1-7}, 9{1,2-4}.
[http://www.beilstein-journals.org/bjoc/content/supplementary/1860-5397-8-117-S1.pdf]

Supporting Information File 2 

The complete listing of the proteins and their PDB IDs 
(Targets Docking Score Matrix). 
[http://www.beilstein-journals.org/bjoc/content/supplementary/1860-5397-8-117-S2.xls]